Omen: discovering sequential patterns with reliable prediction delays

Authors

Abstract

Suppose we are given a discrete-valued time series $X$ of observed events and an equally long binary sequence $Y$ that indicates whether something of interest happened at that particular point in time. We consider the problem of mining serial episodes, sequential patterns allowing for gaps, from $X$ that reliably predict those interesting events. With reliable we mean patterns that not only predict that an interesting event is likely to follow, but in particular that we can also accurately tell how long until that event will happen. In other words, we are specifically interested in patterns with a highly skewed distribution of delays between pattern occurrences and predicted events. As it is unlikely that a single pattern can explain a complex real-world process, we are after the smallest, least redundant set of such patterns that together explain the interesting events well. We formally define this problem in terms of the Minimum Description Length principle, by which we identify the best patterns as those that describe the occurrences of interesting events $Y$ most succinctly given the data over $X$. As neither discovering the optimal explanation of $Y$ given a set of patterns, nor the discovery of the optimal set of patterns, are problems that allow for straightforward optimization, we break the problem in two and propose effective heuristics for both. Through extensive empirical evaluation, we show that both our main method, Omen, and its fast approximation, fOmen, work well in practice and both quantitatively and qualitatively beat the state of the art.
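To make the two central notions concrete (occurrences of a serial episode with gaps in X, and the distribution of delays until the next interesting event in Y), here is a minimal Python sketch. It is not the Omen or fOmen algorithm and involves no MDL scoring. The function names, the `max_gap` parameter, and the toy data are assumptions introduced purely for illustration.

```python
from collections import Counter


def episode_end_positions(x, pattern, max_gap):
    """For each position where `pattern` starts in `x`, return the earliest
    index at which the whole pattern has been matched in order, allowing at
    most `max_gap` other events between consecutive pattern symbols.
    (Hypothetical helper, not taken from the paper.)"""
    ends = []
    for start, symbol in enumerate(x):
        if symbol != pattern[0]:
            continue
        reachable = {start}  # positions where the matched prefix can end
        for nxt in pattern[1:]:
            reachable = {
                j
                for p in reachable
                for j in range(p + 1, min(p + 2 + max_gap, len(x)))
                if x[j] == nxt
            }
            if not reachable:
                break
        if reachable:
            ends.append(min(reachable))
    return ends


def delay_distribution(ends, y):
    """Count, for each pattern occurrence, the delay to the next time step
    at which the binary sequence `y` is 1."""
    counts = Counter()
    for e in ends:
        for t in range(e + 1, len(y)):
            if y[t]:
                counts[t - e] += 1
                break
    return counts


# Toy data (assumed): the episode a -> b is always followed by an event
# exactly two steps after it completes.
x = list("abcdacbddabcd")
y = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]

ends = episode_end_positions(x, ("a", "b"), max_gap=1)
delays = delay_distribution(ends, y)
```

In this toy example every occurrence has the same delay, so the delay distribution is maximally peaked. In the abstract's terms, a pattern whose delays concentrate this way is a reliable predictor, whereas one whose delays spread widely would not be.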


Similar resources

Constraint Relaxations for Discovering Unknown Sequential Patterns

The main drawbacks of sequential pattern mining have been its lack of focus on user expectations and the high number of discovered patterns. However, the solution commonly accepted – the use of constraints – approximates the mining process to a verification of what are the frequent patterns among the specified ones, instead of the discovery of unknown and unexpected patterns. In this paper, we ...


Discovering Sequential Patterns from Non-Uniform Databases

Databases, when used to collect data in the dynamic world, change over time. As a consequence, different parts of a database that come into the database at different times may contain activities with different characteristics. In other words, the data in the database are "non-uniform". This non-uniformity of the database gives rise to the idea of a Divide and Conquer strategy that divides a very l...


Discovering Novelty in Gene Data: From Sequential Patterns to Visualization

Data mining techniques allow users to discover novelty in huge amounts of data. Frequent pattern methods have proved to be efficient, but the extracted patterns are often too numerous and thus difficult to analyse by end-users. In this paper, we focus on sequential pattern mining and propose a new visualization system, which aims at helping end-users to analyse extracted knowledge and to highli...


Fast Algorithms for Discovering Sequential Patterns in Massive Datasets

Problem statement: Sequential pattern mining is one of the specific data mining tasks, particularly from retail data. The task is to discover all sequential patterns with a user-specified minimum support, where the support of a pattern is the number of data-sequences that contain the pattern. Approach: To find sequence patterns, a variety of algorithms like AprioriAll and Generalized Sequential Patte...



Journal

Journal title: Knowledge and Information Systems

Year: 2022

ISSN: 0219-3116, 0219-1377

DOI: https://doi.org/10.1007/s10115-022-01660-1